Strings

Strings from C (char*)

  • Syntax:

char foo[20];

  • It's an array of char: 20 elements here, enough for up to 19 characters plus the terminating null character.

Initialization
  • By convention, the end of a string stored in a character array is signaled by a special character: the null character, whose literal value is written '\0' (backslash, zero).

  • Using chars:

    char my_word[] = { 'H', 'e', 'l', 'l', 'o', '\0' };
    
    • Declares an array of 6 elements of type char, initialized with the characters that form the word "Hello" plus a null character '\0' at the end.

  • Using string literals:

    • Sequences of characters enclosed in double quotes (") are literal constants whose type is a null-terminated array of characters (const char[N] in C++). String literals therefore always have a null character '\0' automatically appended at the end.

    char my_word[] = "Hello";
    // is equivalent to
    char my_word[] = { 'H', 'e', 'l', 'l', 'o', '\0' };
    
Reassigning values
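  • Because character arrays are regular arrays, they have the same restrictions as any other array: once declared, the array as a whole cannot be assigned a new value. Only its individual elements can be assigned.

  • This is a major reason people use pointers: a pointer, unlike an array, can be reassigned to refer to a different string.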

  • This is not valid:

    my_word   = "Bye";
    my_word[] = "Bye";
    my_word   = { 'B', 'y', 'e', '\0' };
    
  • This is valid:

    my_word[0] = 'B';
    my_word[1] = 'y';
    my_word[2] = 'e';
    my_word[3] = '\0';
    
char* or char foo[]
  • See the "Arrays from C -> T* or T foo[]" section.

Strings from <string>

  • Declared in the <string> header (#include <string>).

  • The std::string class is a compound type. Compound types are used in the same way as fundamental types: the same syntax is used to declare variables and to initialize them.

#include <iostream>
#include <string>

int main ()
{
  std::string my_string;
  my_string = "This is a string";
  std::cout << my_string;
  return 0;
}

Initialization
  • A std::string can be initialized with any valid string literal, just like numerical type variables can be initialized with any valid numerical literal.

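// Any one of these equivalent forms (not all three in the same scope):
std::string my_string = "This is a string";   // copy-initialization
std::string my_string ("This is a string");   // direct-initialization
std::string my_string {"This is a string"};   // list-initialization (C++11)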

Ginger Bill's String

struct String {
    u8 *  text;
    isize len;

    u8 const &operator[](isize i) const {
        GB_ASSERT_MSG(0 <= i && i < len, "[%td]", i);
        return text[i];
    }
};
////////////////////////////////////////////////////////////////
//
// gbString - C Read-Only-Compatible
//
//
/*
Reasoning:

    By default, strings in C are null terminated which means you have to count
    the number of characters up to the null character to calculate the length.
    Many "better" C string libraries will create a struct for a string.
    i.e.

        struct String {
            Allocator allocator;
            size_t    length;
            size_t    capacity;
            char *    cstring;
        };

    This library tries to augment normal C strings in a better way that is still
    compatible with C-style strings.

    +--------+-----------------------+-----------------+
    | Header | Binary C-style String | Null Terminator |
    +--------+-----------------------+-----------------+
             |
             +-> Pointer returned by functions

    Due to the meta-data being stored before the string pointer and every gb string
    having an implicit null terminator, gb strings are fully compatible with C-style
    strings and read-only functions.

Advantages:

    * gb strings can be passed to C-style string functions without accessing a struct
      member or calling a function, i.e.

          gb_printf("%s\n", gb_str);

      Many other libraries do either of these:

          gb_printf("%s\n", string->cstr);
          gb_printf("%s\n", get_cstring(string));

    * You can access each character just like a C-style string:

          gb_printf("%c %c\n", str[0], str[13]);

    * gb strings are singularly allocated. The meta-data is next to the character
      array which is better for the cache.

Disadvantages:

    * In the C version of these functions, many return the new string. i.e.
          str = gb_string_appendc(str, "another string");
      This could be changed to gb_string_appendc(&str, "another string"); but I'm still not sure.

    * This is incompatible with "gb_string.h" strings
*/
gb_printf("f->fullpath '%s' f->directory '%s'\n", f->fullpath.text, f->directory.text);

printf("f->directory classic '%.*s'\n", (int)f->directory.len, (char *)f->directory.text);

GB_PANIC("\n\tError in: %s, missing value '%.*s' in module %s\n",
            token_pos_to_string(e->token.pos), LIT(e->token.string), m->module_name);